ABSTRACT

The No Child Left Behind Act of 2001 (NCLB) demands from the 
American public school system that all students, regardless of race or 
socioeconomic status, must be held to the same academic expectations, and 
that their academic progress must be measured using a newly refined concept 
of adequate yearly progress (AYP). Success in complying with the law will no
longer be based on whether a state has created academic standards and 
testing, but rather on how well its students are doing in making real 
progress toward meeting these standards. The new system has a built-in
"specific ambiguity," whereby states have significant flexibility in
developing state accountability systems and general program administration. 
States can thus experiment with their specific implementation of AYP within 
constraints set by the law. Academic achievement standards must describe 
basic, proficient, and advanced levels of achievement, and utilize them for 
all groups of students to prevent failing groups from being hidden. Schools 
that do not make adequate yearly progress for 2 consecutive years will be 
identified as needing improvement, with corrective action being taken after 4 
years of failure. Schools that make or exceed AYP may receive special awards, 
and their teachers may receive financial awards. (Contains 12 references.) 
Adequate Yearly Progress: Results, not Process



By Lisa Graham Keegan, Billie J. Orr & Brian J. Jones 
Education Leaders Council 

When President Bush signed the No Child Left Behind Act of 2001 (NCLB) into law on 
January 8, 2002, he brought to the public school system a new demand. All students — 
regardless of race or socioeconomic status — must be held to the same academic 
expectations, and all students — regardless of race or socioeconomic status — must have 
their academic progress measured using a newly-refined concept of adequate yearly 
progress (AYP). 1 

The term AYP should be nothing new to educators. Title I of the previous version of the
Elementary and Secondary Education Act, the Improving America’s Schools Act (IASA)
of 1994, introduced the concept of adequate progress in its requirements that all states
establish academic content standards, develop tests to assess student progress in those
standards, and create performance standards for those tests. But the focus of the 1994
law centered much more on the process of building the AYP mechanism that would be
used to measure achievement in Title I schools and for Title I students than it did on
ensuring actual academic progress for all students. Consequently, most states have dual
accountability systems in place: one for Title I schools and another for all public
schools. In 2000, only 22 states had a single, unified system to judge the performance of
all public schools. 2

With NCLB, all this changed. The play is no longer the thing; success in complying with 
the law will no longer be based upon whether a state has created academic standards and 
testing, but rather on how well all of its students are doing in making real progress toward 
meeting those standards. That means testing all students, and it means using the same 
system for all students; thus NCLB requires states to use a single accountability system 
for all public elementary and secondary schools to determine whether all students are 
making progress toward meeting state academic content standards. 

This expectation defined by NCLB — that all children will make continuous progress 
toward proficiency on state standards — is the underlying motive behind the new AYP. 
The goal is to ensure that all students, regardless of what they look like or how much 
money their parents earn, make adequate yearly progress, period. “All students can 
learn” is no longer just a mantra, it’s a goal that will be measured every year. 

The AYP process sounds relatively straightforward: States set the bar for what is deemed 
“proficient” in relation to their academic standards. They must then define what level of 
improvement will be sufficient each year to determine not only whether districts and 
schools have made “adequate yearly progress” toward meeting the standard of 
proficiency, but also the rate at which they will get all students to proficiency in twelve
years. Finally, after testing students each year, states will disaggregate the testing results
to determine how specific populations of students are achieving at the state, district, and
school levels, and make those results available to the public. This is simple in
description but complicated in execution, and it is ultimately central to the law. AYP is
used throughout NCLB to determine compliance, rewards, and sanctions. Process is not
enough; it’s results that count.

1 No Child Left Behind Act, P.L. 107-110, 107th Congress, 1st Session, 2001.

2 Margaret E. Goertz and others, “Assessment and Accountability Systems in the 50 States: 1999-2000”
(University of Pennsylvania: Consortium for Policy Research in Education, 2001), 30.

Precisely how we define results — even when it comes to such seemingly simple tasks as 
defining terms like proficient or adequate — will be decided in collaboration with the U.S. 
Department of Education and the states. While this law gives strong guidance, we would 
all do well to approach this collaborative process with humility. State accountability 
systems that seek to ensure the academic success of all students are still relatively new 
and unstudied phenomena. Our experience to date has given us much confidence that the 
broad infrastructure of NCLB is sound, but there is still much to learn and many ways to 
approach the requirements of this new law. 

Defining a System: “Specific Ambiguity” 

Under NCLB, Congress provided the states with significant flexibility in developing 
state accountability systems, and with greater flexibility in general program 
administration than has previously been permitted in federal education law. For example,
state and local education agencies will be allowed for the first time to shift up to 50
percent of their non-Title I administrative funds between programs, or they may even 
shift these funds into Title I itself (though they cannot move funds out of Title I to other 
accounts). States can also apply to receive “flexibility authority,” which will be awarded 
to seven states on a competitive basis to demonstrate even greater gains with greater 
freedom. 

Consistent with this new flexibility, while the objectives of the AYP requirements in 
NCLB are obvious as general guidance, they leave a great deal of room for interpretation 
in their specific implementation. For this reason, the U.S. Department of Education will 
be issuing further instruction on many of the details of the law. We would advise those 
involved in the rulemaking and guidance process to proceed cautiously, for the very 
vagueness of the law — this “specific ambiguity” — is actually an asset, as it leaves each 
state room to experiment within its own strengths and limitations. Rulemakers should not 
eliminate the desired and intentional ambiguity of the law; rather, they should jointly be 
seeking ways to learn from it. As Thomas J. Kane noted in an analysis of the House and 
Senate AYP proposals, 

...states are currently experimenting with a wide range of 
different types of accountability systems. They should be 
allowed to continue experimenting, until the Nation reaches 
a consensus regarding the ideal way to determine which 
schools are making adequate yearly progress and which are 
not.... [I]mpatience is an insufficient excuse for bad
education policy. 3

While NCLB defers in certain respects to state policies and practices, it does lay down 
some non-negotiable directives that states must adhere to in their efforts to develop an 
AYP process. One might compare this to a road map on which the main thoroughfares and
the destination are clearly marked, but unmarked side streets and alleys are also open to
travel along the way. 

Under the law, each state is required to work with its teachers, parents, principals and 
local educational agencies to create a state plan that incorporates challenging academic 
content standards and student achievement standards that apply to all children within the 
state. The academic achievement standards (formerly called performance standards) 
must describe basic, proficient and advanced levels of achievement. As stated 
previously, this is crucial to understanding the concept of AYP, because the goal is for all 
children to reach the proficient level (or beyond). The state must also implement a single 
accountability system that ensures that its schools, districts and the state as a whole make 
adequate yearly progress. 

Further, while each state is responsible for the specifics in defining how it will determine 
“progress,” the federal law is clear that the state’s definitions of AYP must have the 
same high standards of achievement for all public schools in the state, and they must 
follow a 12-year timeline for getting all students to proficiency. The state’s criteria must 
be statistically valid and reliable, require continuous and substantial improvement for all 
students, and measure progress based on state reading and mathematics tests. Secondary 
schools must include graduation rates as a factor in determining progress, and elementary 
schools must use one additional indicator such as attendance, promotion rates or 
increases in participation in advanced classes. 

Data from the 2001-2002 school year will establish the starting point for measuring the 
percentage of students meeting or exceeding the state’s level of proficiency. Each state must
set the initial bar at a level based on either its lowest achieving demographic group or the
scores of its lowest achieving schools, whichever is higher. However, regardless of
where the initial bar is placed, states must define AYP so that all students in all groups 
are expected to improve and achieve the proficiency level in 12 years. 4 The law is 
specific in this goal, but ambiguous in the starting point, deferring to the states for the 
criteria they will use for the initial placement of the bar. 

Once the starting level has been determined, states must then begin raising the bar over 
time, increasing the number of students meeting or exceeding the state’s level of 
proficiency over time, with the goal being 100% of students at proficiency in 12 years. 
The statute requires that the bar be raised in equal increments over time: it must be
raised for the first time not later than two years into the process, and then again at least
once every three years. Where states have leeway is in determining the initial “height” of
the bar, and the rate at which it will be raised over time until 100% of students reach
proficiency.

3 Thomas J. Kane and others, “Assessing the Definition of ‘Adequate Yearly Progress’ in the House and
Senate Education Bills” (Los Angeles: School of Public Policy and Social Research, UCLA, 2001), 12.

4 No Child Left Behind Act, P.L. 107-110, Section 1111(b)(2), 107th Congress, 1st Session, 2001.
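The equal-increment requirement described above can be illustrated with a little arithmetic. The sketch below is only an illustration under stated assumptions: the function name, the 40 percent baseline, and the choice to raise the bar every single year are all hypothetical, and a state could instead raise the bar in less frequent steps within the limits the statute allows.

```python
def annual_targets(starting_pct, years=12, goal=100.0):
    """Return the target percent proficient for year 0 through `years`.

    Assumption (not from the statute text): the bar rises by the same
    amount every year, which is one way to satisfy "equal increments".
    """
    if not 0.0 <= starting_pct <= goal:
        raise ValueError("starting_pct must lie between 0 and the goal")
    step = (goal - starting_pct) / years  # equal increment per year
    return [round(starting_pct + step * year, 2) for year in range(years + 1)]


# Hypothetical state whose 2001-2002 baseline is 40 percent proficient.
targets = annual_targets(40.0)
print(targets[:3], "...", targets[-1])
```

A lower starting point does not relax the end goal; it only makes each yearly increment larger.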

Finally, to ensure that the most disadvantaged students do not get left behind in this 
process — so that states and schools don’t get the more affluent children to proficiency 
first, then go back and start working on at-risk children in the waning years of the 12-year
deadline — states must include separate measurable objectives for “continuous and 
substantial improvement” in both reading and math for students who are minorities, poor, 
disabled, or of limited-English proficiency (LEP). This is how states can monitor how 
well they are doing in closing the achievement gap. 

The bottom line is that, in order to demonstrate adequate yearly progress, the state and its 
districts must show that schools are meeting or exceeding the state annual measurable 
objectives for all students and for students within each subgroup. 

It is important to note that there is also a “safe-harbor” provision found within NCLB, 
created to address the concern that too many schools would be identified as failing simply 
because one subgroup — for example, LEP students — failed to meet the state AYP goals. 
This provision allows schools to avoid being considered as failing so long as (in this 
particular example) the number of LEP students who are below proficiency decreases by 
10 percent when compared with the preceding year, and if LEP students also made
progress on one or more of the additional academic indicators listed above. The law also
requires that at least 95% of students enrolled in the school and in each subgroup take the
state tests in order to meet the standards of AYP. 5 
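The subgroup test, the safe-harbor exception, and the 95 percent participation rule can be combined into one check. What follows is a minimal sketch, not the statutory procedure: the class, field names, and sample numbers are hypothetical, and the 10 percent reduction is computed here on the percentage of students below proficiency, which is one reading of the provision.

```python
from dataclasses import dataclass


@dataclass
class SubgroupResult:
    name: str
    enrolled: int
    tested: int
    proficient: int
    prior_below_pct: float          # percent below proficiency in the preceding year
    other_indicator_improved: bool  # progress on an additional academic indicator


def meets_ayp(group, target_pct):
    """Return True if the subgroup meets the target or qualifies for safe harbor."""
    if group.enrolled <= 0:
        raise ValueError("enrolled must be positive")
    participation = 100.0 * group.tested / group.enrolled
    if participation < 95.0:
        return False  # participation rule applies before anything else
    proficient_pct = 100.0 * group.proficient / group.tested
    if proficient_pct >= target_pct:
        return True
    # Safe harbor: the share below proficiency fell by at least 10 percent
    # relative to the preceding year, and another indicator improved.
    below_pct = 100.0 - proficient_pct
    reduced_enough = below_pct <= 0.9 * group.prior_below_pct
    return reduced_enough and group.other_indicator_improved


# Hypothetical LEP subgroup that misses a 50 percent target but improves.
lep = SubgroupResult("LEP", enrolled=100, tested=97, proficient=40,
                     prior_below_pct=70.0, other_indicator_improved=True)
print(meets_ayp(lep, target_pct=50.0))
```

In this example the subgroup falls short of the target outright but still counts as making progress, because its below-proficiency share dropped by more than a tenth.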

As an external audit for states to gauge the quality of their own standards — to give them 
some idea of how high their bar for proficiency is set and how well they have defined 
progress toward that bar — states will be required every other year to administer the 
National Assessment of Educational Progress (NAEP) tests in reading and math. This is 
not only a significant change from prior law (where NAEP was optional and administered 
only once every four years) but a critical one. NAEP results will act as both light and 
leverage for states serious about taking a closer look at their standards and making any 
necessary modifications to ensure that they remain rigorous. 

What will an ideal system look like? Frankly, we’re not sure yet. Clearly, states will 
develop a single accountability system for all students, create definitions of progress that 
fall within federal parameters, and lay out a timeline for getting all students to 
proficiency in 12 years — and there end the details. Through NCLB, the federal 
government has said, “Here are the guidelines, the flexibility, the resources, and the 
expectations. We’ll meet you back here in 12 years, and we’ll provide you with an 
external audit through NAEP every other year, but we want 100% of your students at 
proficiency or higher.” In the meantime, states should take advantage of the specific 
ambiguity in the law and build the system that works best for them. 



5 No Child Left Behind Act, P.L. 107-110, Section 1111(I), 107th Congress, 1st Session, 2001.



Building a System: Norm- vs. Criterion-Referencing 



It is likely that the goals of AYP will be realized in ways that have not been pursued on a 
national basis, but which will be diligently pursued in individual states. Therefore, we
would advise those overseeing developing systems to be cautious, and not to hasten to declare
them insufficient in process so long as the outcome data they seek and produce match the 
goals and objectives of the law. Remember, this is about results, not process. 

Accountability systems are still a new science. Few have been well researched. Many 
exist on paper, though few have been employed over any significant period of time. For 
this reason, educators, testing directors, and federal officials engaged in “approving” a 
given approach would be well advised to gather all of the pertinent data currently 
available. We may be in for a few surprises. 

As an example, we hear a compelling and well-reasoned argument that the best method 
for testing students is to use a criterion-referenced test that has been tailor-made to 
directly correlate to a state’s specific standards. If that argument is universalized as a 
compliance requirement of NCLB, every state that has not yet done so must commission 
the development of a specialized criterion-referenced test for use every year, rather than 
use any number of pre-existing commercial tests. 

The argument for this approach says that only tests designed specifically around a state’s 
standards can adequately reflect student progress toward those standards. Or so current 
accountability theory seems to suggest. 

Theory is one thing, but we may miss potentially powerful state approaches if this theory 
dictates all future practice. In fact, requiring each state to develop an annual criterion- 
referenced test will immediately undermine extensive efforts already underway in states 
such as California, Arizona, and Tennessee, among others. These states currently use 
norm-referenced tests or test items to gauge academic progress down to the level of an 
individual student, and what they have found bears further study. 

Some of their preliminary data suggest that this method of analyzing student achievement 
results in data comparable in quality and result to that derived from analysis of criterion- 
referenced tests. Until there is sufficient research in this area by those who know testing 
systems best, we should avoid dismissing the use of norm-referenced tests at the outset of 
this endeavor. 

A quick look at Arizona’s testing data should show why. Arizona administers both a 
criterion-referenced test (the AIMS test, shown in the left column on the next page) and a 
norm-referenced test (SAT-9, in the right column). If we lay the results of these two tests 
next to each other — understanding that there are technical differences in the 
administration of the tests that make a perfect correlation impossible — the results are still 
remarkably similar. 6 

6 In this particular case, percentile scores have been converted to normal curve equivalents for a more valid 
comparison of criterion- and norm-referenced test scores. (See above explanation in text.) 
Figure 1. Results from Arizona’s criterion-referenced test (on the left) and
norm-referenced test (on the right) are remarkably similar.
It can, of course, be argued that a criterion-referenced test is more precisely matched to
the state’s specific standards. We don’t disagree. Yet, norm-referenced tests are also 
based on a publicized set of standards, and these are generally consistent with those used 
for criterion-referenced tests. Bear in mind the goal of showing progress — a gain in 
knowledge of material deemed most essential for student success. Both a criterion- 
referenced and a norm-referenced test are made up of questions designed to make an 
effective judgment of student knowledge and skills in defined areas. Where they differ
most significantly is presumably in their range of difficulty. 

While a norm-referenced test seeks questions chosen to elicit a bell-shaped performance 
curve, the criterion-referenced test is made up of questions meant to match the standard. 
For norm-referenced tests, results are displayed primarily in a percentile ranking scale for 
comparison to other students, based on a nationwide “norming” population. However, 
most national norm-referenced tests also offer conversion of their percentile scores into a 
curve representing points given for every correct answer. As the Arizona data show, 
curves and performance levels for the converted norm-referenced tests nearly mirror 
criterion-referenced test results. 
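Footnote 6 mentions converting percentile scores to normal curve equivalents (NCEs) before comparing the two tests. The paper does not spell out the conversion, so the sketch below uses the standard textbook definition of the NCE scale (mean 50, scaled by 21.06 so that NCE values of 1, 50, and 99 line up with the same percentiles); the function name is hypothetical and this is not necessarily the exact procedure used for the Arizona data.

```python
from statistics import NormalDist


def percentile_to_nce(percentile):
    """Convert a national percentile rank (strictly between 0 and 100) to an NCE."""
    if not 0.0 < percentile < 100.0:
        raise ValueError("percentile must be strictly between 0 and 100")
    # z-score of the percentile under a standard normal distribution
    z = NormalDist().inv_cdf(percentile / 100.0)
    return 50.0 + 21.06 * z


for pct in (10, 25, 50, 75, 90):
    print(pct, round(percentile_to_nce(pct), 1))
```

Unlike percentile ranks, NCEs form an equal-interval scale, which is why they can be averaged and compared across tests more defensibly.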

An additional point bears mentioning. Based on his work in Tennessee over the past 15
years, Dr. William Sanders offers the opinion that we do not need to have an 
excruciatingly tight match of state standards to specific test items. In fact, he places far 
more importance on “freshening” a test annually with new items than he does on specific 
linking to a particular standard. 7 It could well be that we have placed too much emphasis 
on states writing their own unique tests. This is yet another assertion that deserves 
additional study. 

We are not arguing that criterion-referenced tests and norm-referenced tests are 
interchangeable. They are designed for different purposes and with distinct strengths and 
weaknesses, but the assumption that a state-developed criterion-referenced test better 
identifies student growth than a norm-referenced “test off the shelf” may not withstand
in-depth analysis. The data produced by both norm- and criterion-referenced tests are so 
strikingly similar that an automatic preference for use of a criterion-referenced test to 
gauge student progress as part of NCLB seems unwarranted for the moment. 

A final word in this regard: Those of us who support NCLB clearly believe that the core 
set of knowledge we seek for our students is sufficiently similar as to be assessable with a 
more generalized examination — otherwise, why the prominent role of the National 
Assessment of Educational Progress (NAEP) as an external audit for states in the new 
law? One cannot argue that gain can only be viewed within the confines of unique state 
assessments while simultaneously extolling the ability of NAEP to judge achievement 
across the board. 



7 Education Commission of the States. A Closer Look: State Policy Trends in Three Key Areas of the Bush 
Education Plan — Testing, Accountability and School Choice. (Denver: Education Commission of the 
States, 2001), 8. 



The conclusion? We need more comparison and research regarding what these tests tell 
us. A number of states presently not only use both norm- and criterion-referenced tests,
but also use them in different subjects, different grades, and, in
some cases, in different locations around their state. Equating the results of this blend of
norm- and criterion-referenced testing may be valid — and then again it may not. Until 
we have more data from the administration of these tests, and the opportunity to look at 
this data in a meaningful way, we ought not be in a hurry to junk the use of norm- 
referenced tests. Educators should currently worry less about whether a test is norm- or 
criterion-referenced, and concentrate instead on its relationship to state goals, and to 
collecting and analyzing the results of those tests in meaningful ways. We’re looking at 
progress, not process. 

High Stakes and Consequences 

AYP requires states to disaggregate test results not only by communities and schools but 
also by specific sub-groups of students. Such disaggregation gives educators and parents 
a truer idea of what is really going on in their school — after all, a school that appears to 
be making progress when one looks at its average score may also show, upon closer 
examination, that certain groups of students have made little or no gains. Disaggregation 
of results is a necessary tool of accountability to ensure that schools do not hide failing 
groups of students behind the law of averages. 
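A small worked example shows how an average can hide a struggling subgroup. The school, the subgroup labels, and the 60 percent target below are all hypothetical numbers chosen for illustration, not data from any state.

```python
def proficiency_rate(proficient, tested):
    """Percent of tested students scoring at or above proficient."""
    return 100.0 * proficient / tested


# Hypothetical school: (students proficient, students tested) per subgroup.
school = {"group_a": (90, 100), "group_b": (10, 50)}
target_pct = 60.0  # hypothetical annual objective

overall = proficiency_rate(
    sum(p for p, _ in school.values()),
    sum(t for _, t in school.values()),
)
by_group = {name: proficiency_rate(p, t) for name, (p, t) in school.items()}
missed = [name for name, rate in by_group.items() if rate < target_pct]

print(round(overall, 1), by_group, missed)
```

Here the school as a whole clears the target, yet one subgroup sits far below it, which is exactly the situation disaggregation is meant to expose.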

So, what happens if students in a school or in a particular subgroup do not meet or exceed 
the state’s defined standard for AYP? The answer is simple: that school would not make 
adequate yearly progress. NCLB is very clear about the consequences that such
schools will face, and the stakes are high. 

If schools and districts do not show gain over a defined period of time, action will be 
taken on behalf of the students in those schools, including mandatory public school 
choice and the provision of individual supplemental services purchased with Title I 
funds. In addition, chronically failing schools face the very real possibility of being
completely restructured, while states that fail to meet their obligations under
their state plan risk the loss of federal administrative dollars.

These potential penalties resonate loudly with schools, districts and states, and they send 
a clear message to parents that the law is serious about providing them opportunities to 
remove their children from consistently failing schools. In a welcome break with past
policy, school failure will result in meaningful consequences, and will empower parents 
to immediately remove their children from failing schools, instead of consigning them to 
continued failure. Further, in a contrast to the overall mood of NCLB, the timelines and 
sanctions imposed for school failure are specific and non-negotiable, as they should be. 
There is simply no more room for flexibility when it comes to consequences for failing 
schools. 

If a school fails to make adequate yearly progress for two consecutive years, it will be 
identified by the district and state as needing improvement. This identification will mean 
that federal funds will be available to states and districts to provide schools with technical
assistance to improve academic achievement — but financial assistance alone is no longer 
seen as a sufficient tonic for the ailment. The school is also subject to stricter and more 
rigorous sanctions to ensure that change occurs as quickly as possible. After two years of 
failure, the district is required to create a plan to turn the school around and to offer 
public school choice to all students in the failing school by the beginning of the next 
school year. Further, the district must pay the costs of transporting any students who opt 
to attend a different public school, including public charter schools. 

If a school fails to make adequate yearly progress for three consecutive years, it must not 
only continue to offer public school choice for all students, but must also allow 
disadvantaged students in the failing school to use Title I funds to pay for supplemental 
services from a provider of choice. Schools will be required to set aside 20 percent of 
their total Title I allocation to pay for both the supplemental services and transportation to 
these services. Not less than 5 percent must be used for each. 

After four years of failure to make adequate yearly progress, districts are required by law 
to implement corrective action in the school. This means that, in addition to continuing
the provision of public school choice and supplemental services, districts must intervene 
more forcefully. This could mean removing school staff, changing school leadership, or 
altering curriculum and programs. Finally, to stem the tide of continuous failure, any 
schools that fail to make adequate progress for five consecutive years would be 
completely restructured. This might mean a state takeover, alternative governance, 
private management, new staff, or becoming a charter school. In essence, they will begin 
anew. 

Schools will be released from the “corrective action” category only after making 
adequate yearly progress for two consecutive years. 
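The escalating timeline described in this section can be summarized as a simple lookup. This is only a sketch of the sequence as the text above lays it out; the function name and the wording of each step are hypothetical shorthand, not statutory language.

```python
def consequences(consecutive_years_missed):
    """List the cumulative consequences after a run of years without AYP."""
    steps = []
    if consecutive_years_missed >= 2:
        steps.append("identified as needing improvement; technical assistance")
        steps.append("district turnaround plan")
        steps.append("public school choice with district-paid transportation")
    if consecutive_years_missed >= 3:
        steps.append("supplemental services paid from Title I funds")
    if consecutive_years_missed >= 4:
        steps.append("corrective action (staff, leadership, or curriculum changes)")
    if consecutive_years_missed >= 5:
        steps.append("restructuring")
    return steps


for years in range(1, 6):
    print(years, consequences(years))
```

Each stage adds to the earlier ones rather than replacing them, so a school in its fourth year still offers choice and supplemental services.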

With the enactment of NCLB, these consequences go into immediate effect for schools 
that have already been identified as in need of improvement under the IASA. These 
schools — some 6,700 of them 8 — are considered to be in their first year of school 
improvement (in 2001-2002) and must offer public school choice in the coming school 
year (2002-2003). Likewise, the 3,000 schools that are already in their second year of 
school improvement under the previous law must provide individual student services to 
supplement the regular school day in addition to public school choice for all low-income 
students in the coming year. This means students who have been in schools identified as 
failing for two or three years will receive immediate help through NCLB. The clock does 
not start over for these students, and failing schools do not receive an amnesty period 
simply because the law changed. 

Just as schools are held to showing results under the AYP process, so too are school 
districts and, ultimately, the state. The state, usually through its state department of 

8 House Committee on Education and the Workforce, Press Release: H.R. 1 Education Reforms Would 
Mean Immediate New Options for Students In Thousands of Failing Schools — Beginning in 2002, 

December 13, 2001. 